A 3D Framework for Characterizing Microstructure Evolution of Li-Ion Batteries
Lithium-ion batteries are found in many modern consumer devices, ranging from portable computers and mobile phones to hybrid and fully electric vehicles. While improving efficiency and reliability is critically important for increasing market adoption of the technology, research on these topics has, to date, been largely restricted to empirical observations and computational simulations. The present study proposes using X-ray microscopy to characterize a sample of commercial 18650 cylindrical Li-ion batteries in both their pristine and aged states. By coupling this approach with 3D and 4D data analysis techniques, the study aims to create a research framework for characterizing the microstructure evolution that leads to capacity fade in a commercial battery. The results demonstrate the unique capability of the microscopy technique to observe the evolution of these batteries under aging conditions, and they yield a workflow for future research studies.
In-Datacenter Performance Analysis of a Tensor Processing Unit
Many architects believe that major improvements in cost-energy-performance
must now come from domain-specific hardware. This paper evaluates a custom
ASIC, called a Tensor Processing Unit (TPU), deployed in datacenters since
2015 that accelerates the inference phase of neural networks (NN). The heart of
the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak
throughput of 92 TeraOps/second (TOPS) and a large (28 MiB) software-managed
on-chip memory. The TPU's deterministic execution model is a better match to
the 99th-percentile response-time requirement of our NN applications than are
the time-varying optimizations of CPUs and GPUs (caches, out-of-order
execution, multithreading, multiprocessing, prefetching, ...) that help average
throughput more than guaranteed latency. The lack of such features helps
explain why, despite having myriad MACs and a big memory, the TPU is relatively
small and low power. We compare the TPU to a server-class Intel Haswell CPU and
an Nvidia K80 GPU, which are contemporaries deployed in the same datacenters.
Our workload, written in the high-level TensorFlow framework, uses production
NN applications (MLPs, CNNs, and LSTMs) that represent 95% of our datacenters'
NN inference demand. Despite low utilization for some applications, the TPU is
on average about 15X - 30X faster than its contemporary GPU or CPU, with
TOPS/Watt about 30X - 80X higher. Moreover, using the GPU's GDDR5 memory in the
TPU would triple achieved TOPS and raise TOPS/Watt to nearly 70X the GPU and
200X the CPU.
Comment: 17 pages, 11 figures, 8 tables. To appear at the 44th International
Symposium on Computer Architecture (ISCA), Toronto, Canada, June 24-28, 201
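
The abstract's headline figures (65,536 MACs, 92 TOPS peak) can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the 256 x 256 array layout, the 700 MHz clock, and the convention of counting each multiply-accumulate as two operations are assumptions not stated in the abstract, and the function name `peak_tops` is hypothetical.

```python
# Back-of-the-envelope check of the TPU peak-throughput figure.
# Assumptions (NOT stated in the abstract above):
#   - the 65,536 MACs are arranged as a 256 x 256 array
#   - the clock rate is 700 MHz
#   - one multiply-accumulate counts as 2 operations (multiply + add)

MAC_ROWS = 256
MAC_COLS = 256
OPS_PER_MAC = 2
CLOCK_HZ = 700e6


def peak_tops(rows, cols, ops_per_mac, clock_hz):
    """Return peak throughput in TeraOps/second for a MAC array."""
    return rows * cols * ops_per_mac * clock_hz / 1e12


mac_count = MAC_ROWS * MAC_COLS
tops = peak_tops(MAC_ROWS, MAC_COLS, OPS_PER_MAC, CLOCK_HZ)

print(mac_count)       # 65536, matching the abstract's MAC count
print(round(tops, 2))  # 91.75, which rounds to the quoted 92 TOPS
```

Under these assumptions the computed peak is consistent with the quoted 92 TOPS; a different clock rate or operation-counting convention would change the result proportionally.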